Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add Chain Data tables #63

Merged
merged 26 commits into from
Dec 29, 2023

Conversation

juan518munoz
Copy link
Contributor

from #5

Add block_headers and chain_mmr_nodes tables with their needed insertion and getter methods to interact with the Store.

@juan518munoz juan518munoz marked this pull request as ready for review December 14, 2023 12:54
src/client/mod.rs Outdated Show resolved Hide resolved
src/client/mod.rs Outdated Show resolved Hide resolved
Comment on lines 89 to 94
-- Create chain mmr nodes
CREATE TABLE chain_mmr_nodes(
id INTEGER, -- in-order index of the internal MMR node (autoincrements)
node BLOB NOT NULL, -- internal node value (hash)
PRIMARY KEY (id)
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bobbinth you mentioned that we would probably need to refactor ChainMmr in Miden base to have it work on a partial MMR (I'm guessing this TODO is related).
Right now the RPC returns partial MMR deltas AFAIK, and it should be enough to maintain a full MMR on the client side by adding new leaves, correct? Can you expand a little bit more on the motivation and how it could work?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe depending on how frequently the client makes requests, we may not get all the leaves and the client won't be able to build the full MMR.

There is an issue to fix this 0xPolygonMiden/miden-base#112 - I'll work on this and it'll be ready by Monday.

Copy link
Contributor

@bobbinth bobbinth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! I left a few comments inline explaining in more detail how the overall structure should work. It may require a bit more explanation though - so, let happy to talk through it.

src/store/store.sql Outdated Show resolved Hide resolved
src/store/store.sql Show resolved Hide resolved
src/store/chain_data.rs Outdated Show resolved Hide resolved
src/store/chain_data.rs Outdated Show resolved Hide resolved
impl Store {
// CHAIN DATA
// --------------------------------------------------------------------------------------------
pub fn insert_block_header(&mut self, block_header: BlockHeader) -> Result<(), StoreError> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in the previous comments, this will need to take an additional parameter for chain_mmr_peaks (this would be Vec<Digest>).

Copy link
Contributor

@bobbinth bobbinth Dec 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now thinking about this, we might also need to pass the forest (u64) to this as well because our goal is to be able to reconstruct PartialMmr that the client keeps track of. This requires:

  • forest: u64 - this would be in the block_headers table.
  • peaks: Vec<RpoDigest> - this would also be in the block_headers table.
  • nodes: BTreeMap<InOrderIndex, RpoDigest> - this would come from the chain_mmr_nodes table.
  • track_latest: bool- not sure where this should come from yet.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still needed? forest can be derived from peaks and the PartialMmr constructor takes only the peaks

Comment on lines 53 to 78
pub fn insert_chain_mmr_node(&mut self, chain_mmr: ChainMmr) -> Result<(), StoreError> {
let node = serialize_chain_mmr(chain_mmr)?;

const QUERY: &str = "INSERT INTO chain_mmr_nodes (node) VALUES (?)";

self.db
.execute(QUERY, params![node])
.map_err(StoreError::QueryError)
.map(|_| ())
}

pub fn get_chain_mmr_hash_by_id(&self, id: u64) -> Result<ChainMmr, StoreError> {
const QUERY: &str = "SELECT id, node FROM chain_mmr_nodes WHERE id = ?";
self.db
.prepare(QUERY)
.map_err(StoreError::QueryError)?
.query_map(params![id as i64], parse_chain_mmr_nodes_columns)
.map_err(StoreError::QueryError)?
.map(|result| {
result
.map_err(StoreError::ColumnParsingError)
.and_then(parse_chain_mmr_nodes)
})
.next()
.ok_or(StoreError::ChainMmrNodeNotFound(id))?
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure these methods are correct. chain_mmr_nodes table is meant to store nodes from PartialMmr struct. Basically, every row in this table would be a single entry in the nodes map of PartialMmr.

We probably don't want to insert the whole partial MMR every time - but rather only insert the resulting from each update.

So, the methods here should probably be a bit lower-level. Something like:

pub fn insert_chain_mmr_nodes(&mut self, nodes: Vec<(InOrderIndex, Digest)>) -> Result<(), StoreError>` {

}

/// Returns all nodes in the table.
pub fn get_chain_mmr_nodes(&mut self) -> Result<BTreeMap<InOrderIndex, Digest>, StoreError> {

}

/// Gets a list of nodes required to reconstruct authentication paths for the specified blocks.
///
/// This will be used for `get_transaction_data()` method of `DataStore`.
pub fn get_chain_mmr_paths(
  &mut self,
  block_numbers: &[u32]
) -> Result<Vec<(InOrderIndex, Digest)>, StoreError> {

}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, to check that I undestand correctly:

insert_chain_mmr_nodes should receive only the nodes derived from each update, iterating over them, inserting each pair (InOrderIndex, Digest) in the table.

get_chain_mmr_nodes retrieves all rows from the table, and puts them inside a BTreeMap.

get_chain_mmr_paths only retrieves the rows on the table which InOrderIndex matches any of the elements of block_numbers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

insert_chain_mmr_nodes should receive only the nodes derived from each update, iterating over them, inserting each pair (InOrderIndex, Digest) in the table.

This is correct. In the naive implementation this should be pretty simple as we can just take everything that was added to the nodes map after the last inserted node (new nodes would always have a bigger index).

get_chain_mmr_nodes retrieves all rows from the table, and puts them inside a BTreeMap.

Correct.

get_chain_mmr_paths only retrieves the rows on the table which InOrderIndex matches any of the elements of block_numbers.

It is a bit more involved as for each block number we'll need to return all nodes in the path from the block to the root of the corresponding peak. Let's leave this for another PR.

Copy link
Contributor Author

@juan518munoz juan518munoz Dec 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the implementation of InOrderIndex it seems like there's no way to serialize this type, and neither access it's inner usize, am I missing something?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, currently missing but I'm adding it in 0xPolygonMiden/crypto#238.

src/store/mod.rs Outdated Show resolved Hide resolved
src/client/chain_data.rs Show resolved Hide resolved
Comment on lines 138 to 140
self.store
.apply_state_sync(new_block_num, new_nullifiers)
.apply_state_sync(new_block_num, new_nullifiers, new_block_header)
.map_err(ClientError::StoreError)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A general comment about sync_state function: my original intent was for it to work slightly differently. Specifically:

The sync_state request to the node gives us the next block containing requested data. It also gives us chain_tip which is the latest block number in the chain. So, unless response.block_header.block_num == response.chain_tip we haven't synced to the tip of the chain yet.

The idea was that we'd make these requests in a loop until response.block_header.block_num == response.chain_tip, at which point we know that we've fully synchronized with the chain.

Each request also brings us info about new notes, nullifiers etc. created. It also returns Chain MMR delta that we can use to update the state of Chain MMR. This includes both chain MMR peaks and chain MMR nodes.

A naive way to update chain MMR is to load full PartialMmr at the beginning of this method and then call apply on it for every response. There is a better way to do it - though, the details require more thought.

Copy link
Contributor Author

@juan518munoz juan518munoz Dec 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can add the naive implementation to leave it in a working state, and change it a better one if we can come up with some.

@bobbinth
Copy link
Contributor

A few more comments about the overall syncing procedure (some applicable to this PR, but many probably for future PRs):

First, as already mentioned in one of the comments above, the idea is that we make requests to the sync_state endpoint until response.chain_tip == response.block_header.block_num. In the requests, we send:

  • block_num from state_sync table.
  • A list of all unique account IDs from the accounts table.
  • A list of tags from the state_sync table.
  • A list of nullifier prefixes for all un-consumed input notes.

For every response we get, we do the following:

  1. Update block_num in the state_sync table to response.block_header.block_num.
  2. Check if the returned account hashes match latest account hashes in the database. If they don't match, something got corrupted and we won't be able to execute transactions against accounts where there is a state mismatch.
  3. For any consumed nullifiers update corresponding input notes. This also implies that transactions in which these notes were created have also been committed and thus we need to update their state and states of involved accounts accordingly.
  4. Update input notes table based on the returned notes. Here, we'll assume that we already have most of the note's details in the table (these notes could be imported previously via a side channel or created locally). But these notes would be missing anchor info (e.g., location in the chain and inclusion path). So, basically, for every returned note:
    a. We look up a note record by note hash in input_notes table. If no note is found, we just move to the next returned note.
    b. If a note is found, we update it's anchor info. This will make this note consumable because now we build the inclusion proof for the note (which is required to execute a transaction).
  5. If the response brought back any relevant notes (e.g., the ones that were not ignored in the previous step), we also need to update our chain data tables. Specifically, we need to insert a new block header (from response.block_header) and also update chain_mmr_nodes table. The simplest way to do this is to maintain in memory representation of PartialMmr struct which contains info from these tables.

@igamigo
Copy link
Collaborator

igamigo commented Dec 22, 2023

FYI, I'm handling some of Bobbin's comments on the syncing procedure on #68 (except the account hash checks and maintaining the MMR which are more related to this PR)

@igamigo
Copy link
Collaborator

igamigo commented Dec 26, 2023

Pushed some changes that use the newer changes of the crypto crate (applying the delta now returns the new nodes and we can use that to insert to the table). Further changes to the sync state procedure are left for future PRs (some of them in #68).

Copy link
Contributor

@bobbinth bobbinth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thank you!

src/store/accounts.rs Outdated Show resolved Hide resolved
@igamigo igamigo merged commit 653b3a9 into 0xPolygonMiden:main Dec 29, 2023
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants